DETAILED ACTION



Response to Arguments 



1. Applicant's arguments with respect to claims 1-28 have been considered but are
moot in view of the new ground(s) of rejection.



Claim Rejections - 35 USC § 103

2. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set
forth in section 102 of this title, if the differences between the subject matter sought to be patented and
the prior art are such that the subject matter as a whole would have been obvious at the time the
invention was made to a person having ordinary skill in the art to which said subject matter pertains.
Patentability shall not be negatived by the manner in which the invention was made.

3. Claims 1, 3, 4, 15 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Van Schyndel (5,940,118).

As per claims 1, 15, Van Schyndel discloses:

a method of providing a user (110, FIG. 5) with one or more visual indications, in
accordance with a display system associated with the user, of who is currently speaking
during an event in which the user is engaged, the event including one or more other
individuals, the method comprising the steps of:

• identifying the location of the individual who is currently speaking during the
event (20, FIG. 5);

• determining whether the individual identified as the current speaker is within a
field of audible perception of the user (50, FIG. 5 and Col. 9, lines 4-6 - the
directional microphone is moved in the direction of imminent speakers only if
they are not within the audible perception of the user as determined by the
system. When the user is teleconferencing, the user's audible field of
perception, in the broadest sense, is defined by what he can hear through the
teleconferencing system and his field of view is defined by the position and
the field of view of the camera in the conference room);

• displaying a first visual indicator to the user, in accordance with the display 
system, in association with the individual identified as the current speaker 
when the individual is within the field of view of the user (105, FIG. 5 and Col. 
7, lines 35-46, Col. 9, lines 22-36, i.e., the video and audio showing the 
individual who is speaking when camera 30 is pointing at that individual and 
transmission of this data to 105);

Van Schyndel (preferred embodiment) does not disclose (1) determining whether the individual
identified as the current speaker is within a field of view of the user and (2) displaying a
second visual indicator to the user, in accordance with the display system, when the
individual identified as the current speaker is not within the field of view of the user.

However, Van Schyndel (preferred embodiment) does disclose that it is well-known
to direct a camera at a determined sound source to provide coordinated
video/audio and eliminate the need for a human operator (see Col. 2, lines 27-34).

It would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify Van Schyndel in order to determine whether the current
speaker is within the field of view of the user as well as the field of audible perception
and then direct the camera toward the speaker for display to the user in Van Schyndel
(i.e., displaying a second visual indicator, via the monitor 105, that would show
redirecting of the camera) as suggested by Van Schyndel as being well-known, the
motivation being to provide coordinated video/audio and eliminate the need for a human
operator.



As per claim 3, Van Schyndel discloses: 

• capturing one or more video images of the one or more individuals participating 
in the event (Col. 7, lines 35-36); 

• analyzing the one or more captured video images to determine which individual 
has one or more facial features indicative of speech (Col.7, lines 39-41); 

• designating the individual with the one or more facial features indicative of 
speech as the current speaker ("current talker", Col. 7, lines 42-43); 

• and determining the location of the individual designated as the current speaker 
(Col. 7, lines 45-50). 

As per claim 4, Van Schyndel teaches capturing one or more video images of the
field of view of the user (step 245, FIG. 2). Since the field of view of the user of the
teleconferencing system in Van Schyndel is always the view through the center camera,
the position of the current talker is determined as an offset from the central camera
position, and the microphones are moved accordingly (FIG. 4a and Col. 7, lines 50-55).

4. Claims 2, 16, 17 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Van Schyndel as applied to claim 1 above, and further in view of Budd et al. (6,222,677).

Van Schyndel discloses a stationary display system (105, FIG. 5). Van
Schyndel does not disclose the use of a display system worn by the user.

Budd et al. teach a head-mounted display system (FIG. 1).

It would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify Van Schyndel as taught by Budd et al. in order to
provide the user with an ability to view the conference while having the freedom to move
around the office, because the display system is light and conveniently wearable (Col.
2, lines 30-35).

5. Claims 6-8 are rejected under 35 U.S.C. 103(a) as being unpatentable over Van 
Schyndel as applied to claim 1 above, and further in view of Potts et al. (6,593,956) 



As per claim 6, Van Schyndel discloses:

• capturing audio data of the one or more individuals participating in the event
(inherent during the use of microphones, 60, FIG. 5);

• analyzing the audio data to determine which individual is uttering sound
indicative of speech (steering microphone towards the direction of incoming
sound, Col. 5, lines 5-8 and step 335, FIG. 3);

• designating the individual uttering sound that is indicative of speech as the
current speaker (determining whether the "imminent talker" is currently
speaking, Col. 9, lines 10-19).

Van Schyndel does not disclose determining the location of the
individual designated as the current speaker using audio information.

Potts et al. teach determining the location of the speaker based on audio
information (114, FIG. 4 and Col. 17, lines 35-55).

It would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify Van Schyndel as taught by Potts et al. in order to improve
the speaker detection, since the combined use of both audio and video detection
modules to identify the current speaker would improve the overall reliability of the
system (Col. 4, lines 25-29).

As per claims 7-8, Van Schyndel teaches capturing directional data associated
with the display system and positional data associated with the user (step 245, FIG. 2).
Since the field of view of the user of the teleconferencing system in Van Schyndel is
always the view through the center camera, the position of the current talker is
determined as an offset from the central camera position, and the microphones are
moved accordingly (FIG. 4a and Col. 7, lines 50-55).

6. Claims 9-11, 18-20 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Van Schyndel as applied to claim 1 above, and further in view of Hein et al. 
(6,466,250) 

Van Schyndel does not disclose that a "first visual indicator comprises a marker 
displayed in proximity to a representation of the individual identified as the current 
speaker on the display system." 

Hein et al. teach using a colored frame or a shared cursor (marker) around the 
speaker's image in order to identify him to the viewer where the frame can change 
color or change background (Col. 7, lines 8-11). 

It would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify Van Schyndel as taught by Hein et al. in order to
identify the current speaker located in the field of view of the user, because, as is
well-known in the art, the indicator would point out the current speaker from a group of
people (Hein et al., Col. 3, lines 7-9).

7. Claims 13-14, 22 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Van Schyndel as applied to claim 1 above, and further in view of Butnaru et al.
(6,240,392).

Van Schyndel does not disclose obtaining textual transcription of audio content
through either human stenography or speech recognition, and displaying textual
transcription on the display system.

Butnaru et al. disclose a system that is capable of recognizing speech
using a speech recognizer (elem. 55, FIG. 3) and displaying the text content to the user
via a display system (Col. 5, lines 49-57).

It would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify Van Schyndel as taught by Butnaru et al. to enable
deaf people using the system to participate in videoconferencing or other forms
of telecommunications (Col. 1, line 63 - Col. 2, line 8).



Allowable Subject Matter 

8. Claims 5, 12, 21 are objected to as being dependent upon a rejected base claim,
but would be allowable if rewritten in independent form including all of the limitations of
the base claim and any intervening claims.

9. Independent claim 23 is allowed. Claims 24-28 are allowed as dependent on 
claim 23 and further limiting its scope. 

The following is a statement of reasons for the indication of allowable subject
matter: the prior art does not teach or suggest displaying the second indicator to a user,
directing him to turn his head in the direction of the speaker (claims 12, 21), or
displaying the first visual indicator in such a way that the images of the current speakers
and field of view of the user are combined (claim 5). Claim 23 combines all of these
individual limitations, and hence is also allowable.

Conclusion 

10. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

Spitzer (WO 99/23524) teaches an eyeglass interface system. 

A white-paper by Personal Captioning System ("Live Theater Captioning System") 
teaches the use of glasses in conjunction with audio/video hardware. 

Michael Brandstein (published 1998), "Real-Time Face Tracking Using Audio and 
Image Data" teaches a system similar to that of Potts et al., in addition showing a facial 
outline of the tracked face. 

11. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Dmitry Brant whose telephone number is (703) 305-8954.
The examiner can normally be reached on Mon. - Fri. (8:30am - 5pm).

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Talivaldis Ivars Smits, can be reached on (703) 306-3011. The fax phone
number for the organization where this application or proceeding is assigned is (703)
872-9306.

Any inquiry of a general nature or relating to the status of this application or
proceeding should be directed to Tech Center 2600 receptionist whose telephone
number is (703) 305-4700.



DB 

8/5/04 




PRIMARY EXAMINER 



